Abstract


The ratio of the values of optimal integer and fractional solutions
to a set covering problem was shown by Johnson [5] and Lovász [6] to be
bounded by $B(d) = 1 + \ln d$, where d is the largest column sum. We show
that if n is the number of variables, $B(n) = \frac{1}{n} \left\lfloor \frac{n+1}{2} \right\rfloor \left\lceil \frac{n+1}{2} \right\rceil$ is a best possible
bound on this ratio. Furthermore, for every n > 20 there are problems for
which B(n) < B(d).





OPTIMAL INTEGER AND FRACTIONAL 
COVERS: A SHARP BOUND ON THEIR RATIO 

by 

Egon Balas 

The simple (unweighted) set covering problem is

(C)   $z_C = \min\{e_n x \mid Ax \ge e_m,\ x \text{ binary}\},$

where A is an m × n 0-1 matrix and, for k = m, n, $e_k$ is the k-vector whose
components are all equal to 1, while x is an n-vector of variables.

If the 0-1 condition on the variables is relaxed to nonnegativity, we 
obtain the continuous or fractional set covering problem 

(F)   $z_F = \min\{e_n x \mid Ax \ge e_m,\ x \ge 0\}.$

A vector x that satisfies the constraints of (C) (of (F)) will be called 
a cover (fractional cover ). 

The set covering problem is known to be NP-complete. One of the best 
known procedures for finding a cover that approximates the optimum is the greedy 
heuristic, which consists of a sequence of steps, each of which assigns the 
value 1 to a variable whose column covers a maximal number of additional rows. 
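The greedy step just described can be sketched as follows. This is a minimal illustration, assuming the matrix is given as a list of rows, each stored as the set of column indices holding a 1; the function name, the example instance, and the tie-breaking rule (lowest column index) are choices made here, not part of the source.

```python
def greedy_cover(rows, n):
    """Greedy heuristic for the unweighted set covering problem.

    rows: list of sets; rows[i] holds the column indices j with a_ij = 1.
    n:    number of columns.
    Returns the list of chosen columns (those with x_j = 1).
    """
    uncovered = set(range(len(rows)))
    chosen = []
    while uncovered:
        # For every column, count the still-uncovered rows it would cover.
        gain = [0] * n
        for i in uncovered:
            for j in rows[i]:
                gain[j] += 1
        # Ties are broken by lowest column index (an arbitrary choice).
        best = max(range(n), key=lambda j: gain[j])
        if gain[best] == 0:
            raise ValueError("some row has no 1 entry, so no cover exists")
        chosen.append(best)
        uncovered = {i for i in uncovered if best not in rows[i]}
    return chosen


# Illustrative 4 x 4 instance (made up for this sketch).
example_rows = [{0, 1}, {1, 2}, {2, 3}, {0, 3}]
cover = greedy_cover(example_rows, 4)
print(cover)
```

Each pass recomputes the gains from scratch, which keeps the sketch short at the price of efficiency.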

The worst case behavior of the greedy heuristic for the (unweighted) set covering
problem was shown by Johnson [5] and Lovász [6] to be given by the relation

(1)   $\dfrac{z_G}{z_F} \le H(d),$

where $z_G$ is the value of a cover obtained by the greedy heuristic,

$$d = \max_{j \in \{1,\dots,n\}} \sum_{i=1}^{m} a_{ij},$$

and

$$H(d) = \sum_{j=1}^{d} \frac{1}{j}.$$
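To make the size of H(d) concrete, the short sketch below evaluates the harmonic sum exactly and prints it next to 1 + ln d. It relies on the standard inequality ln(d + 1) < H(d) ≤ 1 + ln d, which is a well-known fact rather than something stated in the source; the function name is chosen here for illustration.

```python
import math
from fractions import Fraction


def H(d):
    """H(d) = 1 + 1/2 + ... + 1/d, computed exactly as a fraction."""
    return sum(Fraction(1, j) for j in range(1, d + 1))


for d in (1, 2, 10, 100):
    # H(d) lies between ln(d + 1) and 1 + ln d, so it grows like ln d.
    print(d, float(H(d)), 1 + math.log(d))
```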

Thus the ratio between the value of a "greedy" cover and that of an optimal
fractional cover increases at most with the logarithm of the largest column sum. 

Chvátal [2] has shown that the worst case bound given by (1) is also
valid for the greedy heuristic when applied to the weighted set covering problem
with arbitrary but positive cost coefficients $c_j$, j = 1,...,n. If $k_{jt}$ represents
the number of new rows covered by column j at step t, the greedy heuristic
for the weighted set covering problem assigns the value 1 at step t to a variable
$x_j$ whose choice maximizes $k_{jt}/c_j$. Furthermore, Ho [3] has shown that the bound
given by (1) is best possible for any (weighted) set covering heuristic that
assigns the value 1 at step t to a variable $x_j$ whose choice maximizes some
arbitrary function $f(c_j, k_{jt})$.
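The family of heuristics in this paragraph differs only in the selection function, which the following sketch makes explicit. It is an illustration under assumptions: rows are sets of column indices, `score` plays the role of $f(c_j, k_{jt})$ and defaults to the weighted greedy rule $k_{jt}/c_j$, and the names and the small instance are invented here.

```python
def greedy_by_score(rows, costs, score=lambda c, k: k / c):
    """Set covering heuristic that, at each step, sets x_j = 1 for a column j
    maximizing score(c_j, k_jt), where k_jt is the number of new rows that
    column j would cover at step t."""
    n = len(costs)
    uncovered = set(range(len(rows)))
    chosen = []
    while uncovered:
        k = [0] * n
        for i in uncovered:
            for j in rows[i]:
                k[j] += 1
        # Only columns that cover at least one new row are eligible.
        candidates = [j for j in range(n) if k[j] > 0]
        if not candidates:
            raise ValueError("some row has no 1 entry, so no cover exists")
        best = max(candidates, key=lambda j: score(costs[j], k[j]))
        chosen.append(best)
        uncovered = {i for i in uncovered if best not in rows[i]}
    return chosen


# Illustrative weighted instance (made up for this sketch).
rows_w = [{0, 2}, {1, 2}, {0, 1}]
costs_w = [1.0, 1.0, 3.0]
cover_w = greedy_by_score(rows_w, costs_w)
print(cover_w, sum(costs_w[j] for j in cover_w))
```

Passing `score=lambda c, k: k` recovers the unweighted rule, since costs are then ignored.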

Another class of heuristics, which uses information (reduced costs) 
obtained from a (not necessarily optimal) solution to the dual linear program, 
has consistently outperformed in empirical tests the greedy heuristic and its 
above mentioned generalizations (see Balas and Ho [1]), but no worst case 
bound better than (or comparable to) (1) is known for it (see Hochbaum [4] for 
a discussion of bounds for this heuristic). 

Since $z_G \ge z_C \ge z_F$, the relation (1) implies of course both

(2)   $\dfrac{z_G}{z_C} \le H(d)$

and

(3)   $\dfrac{z_C}{z_F} \le H(d).$

However, while H(d) is a best possible bound for both $z_G/z_F$ and $z_G/z_C$,
it was until recently an open question whether it is also a best possible bound
for $z_C/z_F$, since no better bound than H(d) was known for this latter ratio.

In this paper we give a best possible bound on the value of $z_C/z_F$ for
unweighted set covering problems, as a function of the number n of columns, for 
an arbitrary number of rows. For every value of n, there are problems for which 
this bound has a value of approximately H(d). 

For an arbitrary 0-1 matrix A, we will denote by $z_{C(A)}$ and $z_{F(A)}$ the
value of an optimal solution to the (unweighted) set covering problem defined
by A, and to the fractional set covering problem defined by A, respectively.

Let $\mathcal{A}_n$ denote the class of 0-1 matrices with at most n columns, and let

$$\mathcal{A}_n(p) = \{A \in \mathcal{A}_n \mid z_{C(A)} = p\}.$$


Theorem 1. For any positive integer n and any $p \in \{1,\dots,n\}$,

(4)   $\min_{A \in \mathcal{A}_n(p)} z_{F(A)} = \dfrac{n}{n-p+1},$

and the minimum in (4) is attained for the $\binom{n}{p-1} \times n$ matrix $A^*$ whose rows are all
the distinct 0-1 n-vectors with exactly n-p+1 components equal to 1.

Proof. We first show that $A^* \in \mathcal{A}_n(p)$. $A^*$ has n columns by assumption.
Any binary n-vector x having at least p components equal to 1 satisfies $A^* x \ge e_q$,
where $q = \binom{n}{p-1}$, since no row of $A^*$ has more than p-1 entries equal to 0. Further,
every binary n-vector x with at most p-1 components equal to 1 violates the
inequality corresponding to that particular row of $A^*$, whose p-1 entries equal
to 0 include those positions where $x_j = 1$. Thus $z_{C(A^*)} = p$, i.e., $A^* \in \mathcal{A}_n(p)$.
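The first step of the proof can be checked by exhaustive search on small cases. The sketch below builds the rows of A* (all 0-1 n-vectors with exactly n-p+1 ones, stored as index sets) and confirms that the smallest cover uses exactly p columns; the helper names are invented here, and the brute-force search is only practical for small n.

```python
from itertools import combinations


def a_star(n, p):
    """Rows of A*: every 0-1 n-vector with exactly n - p + 1 ones,
    represented as the set of column indices that hold a 1."""
    return [set(c) for c in combinations(range(n), n - p + 1)]


def min_cover_size(rows, n):
    """Smallest number of columns that hits every row (exhaustive search)."""
    for size in range(n + 1):
        for cols in combinations(range(n), size):
            chosen = set(cols)
            if all(chosen & r for r in rows):
                return size
    return None  # reached only if some row is empty


# Check z_C(A*) = p for all small cases.
for n in range(1, 8):
    for p in range(1, n + 1):
        assert min_cover_size(a_star(n, p), n) == p
```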

Next we show that $z_{F(A^*)} = n/(n-p+1)$. Let k = n-p+1, and let $\bar{x}$ be
defined by $\bar{x}_j = 1/k$, j = 1,...,n. Let B be any n × n nonsingular submatrix
of $A^*$, such that every column of B has exactly k entries equal to 1. The
definition of $A^*$ guarantees the existence of B. Now let $\bar{u}$ be the q-vector
defined by $\bar{u}_i = 1/k$ if the i-th row of $A^*$ is a row of B, $\bar{u}_i = 0$ otherwise. Then
$\bar{x}$ and $\bar{u}$ are feasible solutions to the linear program $\min\{e_n x \mid A^* x \ge e_q,\ x \ge 0\}$
and its dual, respectively, with value $e_n \bar{x} = e_q \bar{u} = n/k$. Hence $\bar{x}$ is an optimal
fractional cover, and $z_{F(A^*)} = e_n \bar{x} = n/(n-p+1)$.
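The primal and dual solutions used in this step can be verified numerically with exact fractions. The sketch below is an illustration under assumptions: it takes for B the n cyclic shifts of a block of k consecutive ones (one convenient submatrix whose columns each contain exactly k ones; nonsingularity is not checked, since the feasibility argument does not need it), and it requires k < n so that the n shifts are distinct rows. All names are invented here.

```python
from fractions import Fraction
from itertools import combinations


def check_fractional_optimum(n, p):
    """Return (primal feasible?, dual feasible?, primal value, dual value)
    for x_j = 1/k and u_i = 1/k on the rows of a cyclic submatrix B."""
    k = n - p + 1
    if not 1 <= k < n:
        raise ValueError("this sketch needs 2 <= p <= n")
    rows = list(combinations(range(n), k))      # rows of A*, as index tuples
    x = [Fraction(1, k)] * n                     # primal candidate
    # Rows of B: the n cyclic shifts of k consecutive ones.
    b_rows = {tuple(sorted((s + t) % n for t in range(k))) for s in range(n)}
    u = {r: (Fraction(1, k) if r in b_rows else Fraction(0)) for r in rows}
    # Primal feasibility: every row of A* has weight at least 1.
    primal_ok = all(sum(x[j] for j in r) >= 1 for r in rows)
    # Dual feasibility: every column carries dual weight at most 1.
    dual_ok = all(sum(u[r] for r in rows if j in r) <= 1 for j in range(n))
    return primal_ok, dual_ok, sum(x), sum(u.values())


print(check_fractional_optimum(5, 3))
```

Equal primal and dual values certify optimality of both solutions by linear programming duality.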

Finally, we show that $A^*$ minimizes $z_{F(A)}$ over $\mathcal{A}_n(p)$. Assume this to be
false, and let $A^\circ$ be a matrix that minimizes $z_{F(A)}$ over $\mathcal{A}_n(p)$, with $z_{F(A^\circ)} < z_{F(A^*)}$.
Also, let $A^* = (a^*_{ij})$, $A^\circ = (a^\circ_{ij})$. W.l.o.g., we may assume that $A^\circ$ has n columns,
since adding columns whose entries are all equal to 0 does not change either the
integer or the fractional optimum. For every $S \subset \{1,\dots,n\}$ such that |S| = p-1,
$A^\circ$ has a row i such that $a^\circ_{ij} = 0$, $\forall j \in S$; or else x defined by $x_j = 1$, $j \in S$,
$x_j = 0$, $j \notin S$, would be a cover with value p-1, contrary to the assumption that
$A^\circ \in \mathcal{A}_n(p)$. Hence for every row i of $A^*$, $A^\circ$ has a row k such that $a^\circ_{kj} \le a^*_{ij}$,
j = 1,...,n. But then $x \ge 0$, $A^\circ x \ge e_r$ implies $A^* x \ge e_q$ (where r is the number
of rows of $A^\circ$), hence $z_{F(A^*)} \le z_{F(A^\circ)}$, a contradiction. ||

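Theorem 2. For any $A \in \mathcal{A}_n$,

(5)   $\dfrac{z_{C(A)}}{z_{F(A)}} \le \dfrac{1}{n} \left\lfloor \dfrac{n+1}{2} \right\rfloor \left\lceil \dfrac{n+1}{2} \right\rceil,$

and this is a best possible bound.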



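Proof. For fixed $p \in \{1,\dots,n\}$, from Theorem 1,

(6)   $\max_{A \in \mathcal{A}_n(p)} \dfrac{z_{C(A)}}{z_{F(A)}} = \dfrac{p}{n}(n-p+1).$

If p is allowed to vary continuously in the interval [1, n], the right
hand side of (6) is concave and attains its maximum for p = (n+1)/2. Since p
has to be integer, the maximum is attained either for $p = \left\lfloor \frac{n+1}{2} \right\rfloor$, or for $p = \left\lceil \frac{n+1}{2} \right\rceil$,
namely,

$$\max_{A \in \mathcal{A}_n} \frac{z_{C(A)}}{z_{F(A)}} = \max\left\{ \frac{1}{n} \left\lfloor \frac{n+1}{2} \right\rfloor \left( n - \left\lfloor \frac{n+1}{2} \right\rfloor + 1 \right),\ \frac{1}{n} \left\lceil \frac{n+1}{2} \right\rceil \left( n - \left\lceil \frac{n+1}{2} \right\rceil + 1 \right) \right\} = \frac{1}{n} \left\lfloor \frac{n+1}{2} \right\rfloor \left\lceil \frac{n+1}{2} \right\rceil.$$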


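Another expression for the above bound is given by

(7)   $\dfrac{1}{n} \left\lfloor \dfrac{n+1}{2} \right\rfloor \left\lceil \dfrac{n+1}{2} \right\rceil = \begin{cases} \dfrac{n}{4} + \dfrac{1}{2} & \text{if n is even,} \\[2mm] \dfrac{n}{4} + \dfrac{1}{2} + \dfrac{1}{4n} & \text{if n is odd.} \end{cases}$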
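The maximization over integer p and the closed form in (7) are easy to confirm with exact arithmetic. The following sketch, with function names invented here, checks for a range of n that the largest value of (p/n)(n-p+1) over p = 1,...,n equals the floor-ceiling product divided by n, and that both agree with the even and odd cases of (7).

```python
from fractions import Fraction


def ratio_for_p(n, p):
    """Right-hand side of (6): (p / n) * (n - p + 1)."""
    return Fraction(p * (n - p + 1), n)


def bound(n):
    """(1/n) * floor((n+1)/2) * ceil((n+1)/2)."""
    lo = (n + 1) // 2   # floor((n+1)/2)
    hi = (n + 2) // 2   # ceil((n+1)/2)
    return Fraction(lo * hi, n)


def closed_form(n):
    """Expression (7), split by the parity of n."""
    if n % 2 == 0:
        return Fraction(n, 4) + Fraction(1, 2)
    return Fraction(n, 4) + Fraction(1, 2) + Fraction(1, 4 * n)


for n in range(1, 200):
    best = max(ratio_for_p(n, p) for p in range(1, n + 1))
    assert best == bound(n) == closed_form(n)
```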


Thus, the n variables set covering problem for which the ratio $z_{C(A)}/z_{F(A)}$
attains its maximum, is the one whose coefficient matrix has exactly $\left\lfloor \frac{n+1}{2} \right\rfloor$ 1's in
every row, and contains as a row every binary n-vector with $\left\lfloor \frac{n+1}{2} \right\rfloor$ components
equal to 1. For this problem, $z_{C(A)} = \left\lceil \frac{n+1}{2} \right\rceil$ and $z_{F(A)} = n / \left\lfloor \frac{n+1}{2} \right\rfloor = \frac{2n}{n+\delta}$, where δ = 0
if n is even and δ = 1 if n is odd.

Before concluding our paper, we compare the bound on $z_{C(A)}/z_{F(A)}$ given
in Theorem 2, with the bound on $z_{G(A)}/z_{F(A)}$ given by (1). To do this, we note
that when we consider the bound H(d) given by (1) for all set covering problems
defined by matrices $A \in \mathcal{A}_n$, the largest d that can occur (provided A has no
componentwise equal rows), happens to occur for the matrix $A^*$ having as rows
all possible 0-1 n-vectors with exactly $\left\lfloor \frac{n+1}{2} \right\rfloor$ components equal to 1. For this
matrix, we denote $d(A^*) = d^*$, and we have

$$d^* = \binom{n-1}{\left\lfloor \frac{n+1}{2} \right\rfloor - 1}.$$

We want to assess the value of the ratio

(8)   $R = \dfrac{1 + \ln d^*}{\dfrac{1}{n} \left\lfloor \dfrac{n+1}{2} \right\rfloor \left\lceil \dfrac{n+1}{2} \right\rceil}.$
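The ratio in (8) can be evaluated directly for specific n. The sketch below is a hedged illustration: it assumes that $d^*$ is the column sum of the matrix whose rows are all 0-1 n-vectors with $\lfloor (n+1)/2 \rfloor$ ones, that is $\binom{n-1}{\lfloor (n+1)/2 \rfloor - 1}$, and that the numerator of (8) is $1 + \ln d^*$; the function names are invented here.

```python
import math


def d_star(n):
    """Column sum of the matrix with all rows having floor((n+1)/2) ones:
    each column appears in C(n-1, floor((n+1)/2) - 1) of those rows."""
    return math.comb(n - 1, (n + 1) // 2 - 1)


def bound(n):
    """(1/n) * floor((n+1)/2) * ceil((n+1)/2)."""
    return ((n + 1) // 2) * ((n + 2) // 2) / n


def R(n):
    """Ratio (8) under the assumptions stated above."""
    return (1 + math.log(d_star(n))) / bound(n)


for n in (10, 100, 1000):
    print(n, R(n))
print("4 ln 2 =", 4 * math.log(2))
```

Under these assumptions the computed values increase toward 4 ln 2 as n grows, since $\ln d^*$ behaves like $(n-1) \ln 2$ while the denominator behaves like n/4.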


Theorem 3 . For n > 2, 

(9) . 

Proof . From (8), we have 



Using Stirling's formula as refined by Robbins,

$$n^n e^{-n} (2\pi n)^{1/2} e^{1/(12n+1)} < n! < n^n e^{-n} (2\pi n)^{1/2} e^{1/(12n)},$$

we have
, I n-1 I , f n-l ~l , , » n-1 . n-1

Using j_—J + [ 2 i = n_1 and ^ | n-1 j - fn-11 * 

L 2 J ! 2 I 


we obtain 


(11)


. (n+1) (n-1) , n-1 _ n L n-1 . n-1 

> j n+1 fit+T] ** r n -r ( + 1 n+l fn+r 6 " n * n fn-l] 

ltJt! It i ltjti \ ! 2 1 


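As the last term is nonnegative for n > 2, and

$$\left\lfloor \frac{n+1}{2} \right\rfloor \left\lceil \frac{n+1}{2} \right\rceil \le \frac{(n+1)^2}{4}, \qquad \left\lceil \frac{n-1}{2} \right\rceil \le \frac{n}{2},$$

inequality (11) implies (9). ||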

The value of the righthand side in (9) is 2.5 for n = 20, and it
approaches the constant $4 \ln 2 \approx 2.77$ as n goes to infinity. Thus for the
problems for which d = d*, the bound on $z_C/z_F$ is about 1/2.7 of the bound on